Text File | 1997-09-11 | 48.6 KB | 1,123 lines |
-
- 1. Patch SG0002211 Release Note
-
- This release note describes patch SG0002211 to IRIX 6.4.
-
- Patch SG0002211 replaces patch(es): SG0001815, SG0001856,
- SG0001954, SG0001978, SG0002056, SG0002117, and SG0002121.
-
- 1.1 Supported Hardware Platforms
-
- This patch contains bug fixes for IP27 and IP30 Platforms.
- The software cannot be installed on other configurations.
-
-
- 1.2 Supported Software Platforms
-
- This patch contains bug fixes for IRIX 6.4 (version
- 1263561140). The software cannot be installed on other
- configurations.
-
- 1.3 Bugs Fixed by Patch SG0002211
-
- This patch contains fixes for the following bugs in IRIX
- 6.4. Bug numbers from the Silicon Graphics bug tracking
- system are included for reference. For bugs fixed in prior
- patches, fix descriptions are grouped under the replaced
- patches.
-
- +o Bug #459567 : Memory mapping of files with DMAPI
- managed regions did not trigger the correct DMAPI
- events for xfs file systems mounted "-o dmi".
-
- +o Bug #484792 : mmap errors for file offsets > 2 GByte.
-
- +o Bug #494445 : prctl(PR_SETEXITSIG, signal) doesn't
- provide the semantics needed by most multi-threaded
- applications. The semantics of PR_SETEXITSIG were
- defined at a time when parallel Fortran codes were the
- order of the day. In that world, if any thread exited
- the application for any reason whatsoever, the
- application needed to terminate. With multi-threaded
- applications there is still the desire to terminate the
- application if any of the threads terminate abnormally,
- but calls to exit() and exec() by a thread shouldn't
- cause application termination. This patch adds a new
- prctl(PR_SETABORTSIG, signal) which does exactly that.
- If any thread aborts due to a signal, the share group
- will be sent the specified signal. On the other hand,
- if a thread exits the share group via a call to exit()
- or exec() the signal will not be sent. PR_SETEXITSIG
- and PR_SETABORTSIG are mutually exclusive; setting
- either one will nullify any previous setting of the
- other. As with PR_SETEXITSIG, doing a
- prctl(PR_SETABORTSIG, 0) disables the abort signal
- processing.
-
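- The difference between the two settings can be sketched
- with a small Python model. This is only an illustration of
- the semantics described above, not IRIX code: the
- ShareGroup class and its method names are hypothetical
- stand-ins for prctl(PR_SETEXITSIG, signal) and
- prctl(PR_SETABORTSIG, signal).
-
```python
from typing import Optional


class ShareGroup:
    """Toy model of the exit/abort signal settings of a share group."""

    def __init__(self) -> None:
        self.exit_sig = 0    # models prctl(PR_SETEXITSIG, sig); 0 = disabled
        self.abort_sig = 0   # models prctl(PR_SETABORTSIG, sig); 0 = disabled

    def set_exit_sig(self, sig: int) -> None:
        # Setting one option nullifies any previous setting of the other.
        self.exit_sig, self.abort_sig = sig, 0

    def set_abort_sig(self, sig: int) -> None:
        self.abort_sig, self.exit_sig = sig, 0

    def signal_on_departure(self, killed_by_signal: bool) -> Optional[int]:
        """Signal sent to the group when a member leaves, or None.

        killed_by_signal is True for an abnormal termination and False
        for a voluntary exit() or exec().
        """
        if self.exit_sig:
            return self.exit_sig       # any departure notifies the group
        if self.abort_sig and killed_by_signal:
            return self.abort_sig      # only an abort notifies the group
        return None
```
-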
- +o Bug #458133 : Large page tuneables should not have any
- limits.
-
- The large page tuneables (nlpages_*) used for reserving
- large pages at boot time had a limit of 64. This limit
- does not make sense and hampers kernel configurations
- for databases. The limits should be enforced based on
- the total memory in the system. The bug fix removes the
- max limits.
-
- +o Bug #473859 : Tuneable to turn off mmap performance
- optimization for workstations.
-
- This fix adds a tuneable, enable_devzero_opt, to turn off
- the region coalescing optimization (adjacent regions
- are coalesced if they map the same file (/dev/zero) and
- have the same attributes). The optimization is very
- useful for X servers on workstations (it avoids search
- time across lots of regions) but is not very useful for
- large compute-intensive machines. Turning off the
- optimization enables programmers to create multiple
- regions adjacent to each other (address space wise) to
- avoid the region lock bottleneck.
-
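- As a rough illustration of region coalescing, the Python
- sketch below merges a new /dev/zero mapping into an
- adjacent region with the same attributes unless the
- optimization is switched off. The Region type and the
- map_region function are hypothetical; only the tuneable
- name enable_devzero_opt comes from this note, and it is
- modelled here as a simple boolean.
-
```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    start: int   # first address of the mapping
    end: int     # one past the last address
    file: str    # backing file, e.g. "/dev/zero"
    attrs: str   # protection and other attributes, e.g. "rw"


def map_region(regions: List[Region], new: Region,
               enable_devzero_opt: bool = True) -> List[Region]:
    """Add a mapping, merging it into an adjacent region when allowed."""
    if enable_devzero_opt and new.file == "/dev/zero":
        for r in regions:
            if r.file != new.file or r.attrs != new.attrs:
                continue
            if r.end == new.start:      # new mapping follows r
                r.end = new.end
                return regions
            if new.end == r.start:      # new mapping precedes r
                r.start = new.start
                return regions
    regions.append(new)                 # kept as a separate region
    return regions
```
-
- With the optimization off, two adjacent mappings stay as
- two regions, each of which would have its own lock.
-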
- +o Bug #502996 : page_discard needs to support SBE page
- discarding.
-
- In case of SBE memory errors, we would like to not
- reuse the page after it is freed up by the using
- processes but allow the current users to access the
- page while they have a reference to it. This is now
- supported.
-
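- The behavior described above can be modelled with a
- reference count and a discard mark. The Python sketch
- below is an assumption-laden toy: the Page and PagePool
- classes are hypothetical, and only the name page_discard
- is taken from this note.
-
```python
class Page:
    """Toy page frame with a reference count and a discard mark."""

    def __init__(self, pfn: int) -> None:
        self.pfn = pfn
        self.refcnt = 0
        self.discarded = False


class PagePool:
    def __init__(self, npages: int) -> None:
        self.free = [Page(pfn) for pfn in range(npages)]
        self.retired = []   # pages withheld from reuse

    def alloc(self) -> Page:
        page = self.free.pop()
        page.refcnt = 1
        return page

    def page_discard(self, page: Page) -> None:
        # Mark only: current holders keep their references and may
        # continue to access the page.
        page.discarded = True

    def release(self, page: Page) -> None:
        page.refcnt -= 1
        if page.refcnt == 0:
            # A discarded page is retired instead of being reused.
            (self.retired if page.discarded else self.free).append(page)
```
-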
- +o Rfe #502809 : Need new interfaces for UniCenter CA
-
- This patch has some interfaces that are needed for CA-
- UniCenter.
-
- +o Bug #503126: Turned off promlogging to remote nodes on
- NI errors.
-
- +o Bug #504923 : Fix so diskless clients can boot (bug
- introduced in patch 1978).
-
- +o Bug #505685: BTE errors should dump hardware error
- state.
-
- This was fixed by doing a dump of the hardware error
- state before panicking on the bte crb error. Also the
- panic message has been expanded to include relevant CRB
- information.
-
- +o Bug #506220 : idbg error on "vfs" command for DMAPI
- file system (e.g., file system mounted "-o dmi").
-
- +o Bug #706050 : CPU 48: KERNEL FAULT SOFTWARE DETECTED
- SEGV
-
- This was a problem where sigtosharegroup didn't have
- any locking against exiting sproc processes - thus an
- exiting process could call detachshaddr, setting
- p_shaddr to null, while the caller of sigtosharegroup
- was trying to use the p_shaddr field.
-
- +o Bug (unreported) : Optimal assignment of I/O boards to
- nodes was incorrect
-
- It was previously possible for the assignment of a node
- to control a given I/O board to be different from the
- documented assignment, due to an off-by-one error.
- This patch includes a fix that makes the assignment
- conform to documented assignments.
-
- +o Bug #453414: SysV semaphores - sempid wrong for
- pthreads
-
- The sempid field was incorrectly using the sproc PID
- instead of the shared process PID. For pthread apps
- this meant that sempid might not match getpid() even
- though only threads from the same process accessed the
- semaphore.
-
- +o Bug #501616: fo_scsi_lun_remove was not in the failover
- stubs module, requiring the inclusion of failover.o in
- diskless kernels.
-
- +o Bug #501507: Race condition in mon_trace_switch
-
- Fix a race condition in mon_trace_switch(). The kernel
- cannot depend on the value of a variable read before
- grabbing the lock. The variable needs to be read again
- after grabbing the lock, and before dereferencing it as a
- pointer.
-
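- The pattern of reading the variable again under the lock
- can be sketched in Python as follows. This is a generic
- illustration, not the mon_trace_switch() code: the Monitor
- class and its trace_buf field are hypothetical.
-
```python
import threading
from typing import Optional


class Monitor:
    def __init__(self) -> None:
        self.lock = threading.Lock()
        # Another thread may set this to None at any time.
        self.trace_buf: Optional[list] = []

    def disable(self) -> None:
        with self.lock:
            self.trace_buf = None

    def trace(self, event: str) -> bool:
        if self.trace_buf is None:     # unlocked peek: only a hint
            return False
        with self.lock:
            buf = self.trace_buf       # read again under the lock
            if buf is None:            # it changed while we waited
                return False
            buf.append(event)          # safe: value is current
            return True
```
-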
- +o Bug #507073: MD Directory error register reporting is
- wrong
-
- Hub Memory interface error register bit field decoding
- was incorrect. Error dumping code was not decoding one
- of the fields.
-
- Bugs fixed in Patch SG0002121:
-
- +o Bug #427056: vnode pcache race
-
- +o Bug #489537: gang scheduler hang
-
- +o Bug #491852: gang scheduler problem in patch 1978
-
- +o Bug #449470: prreaddir returns bad data if multiple
- pids go to same slot
-
- This led to the possibility that ps or ls /proc may
- list incorrect data. If one was very unlucky, the bug
- could lead to stack corruption within the kernel, with
- the possibility of a resulting crash. This bug was
- never observed in the field, but was discovered by code
- inspection.
-
- +o Bug #483959: improved mlockall() handling with
- MCL_FUTURE flag
-
- Prevent a gfx application from mistakenly getting a
- SIGSEGV when using mlockall(3C) with the MCL_FUTURE
- flag.
-
- +o Bug #481501 : AW- reboot on the Octane & Onyx2 running
- MRed code
-
- +o Bug #486400 : ISV app crashes system
-
- +o Bug #486264 : kernel panic when running frame4
-
- Each of these bugs resulted in a machine ASSERT failure
- with the following message: assertion failed cpu 0:
- (rp->r_refcnt > 1) || !(flags & RF_EXITING), file:
- ../os/region.c, line: 1006. This was caused by a bug in
- close-on-exec processing for sproc processes, and the
- fix is in fact the same as the one for bug #484611. Bug
- #484611 was fixed in kernel rollup patch 1978.
-
- +o Bug #491891: io_spunlock() needs to be improved for
- IP27
-
- io_spunlock() needs to make sure that the PIO
- operations launched by the processor holding the lock
- go in order before the lock is released. This fix
- forces a sync operation to force all PIO operations to
- reach a hardware domain where PIOs are always in order.
-
- +o Bug #491895: Hub 2.1 workaround
-
- This is a workaround to reduce or eliminate cache
- interventions, which helps to avoid hitting one of the
- problems in Hub 2.1.
-
- +o Bug #494592: Better error message
-
- Error messages on a bus error were made more user
- friendly by including the module/slot information.
-
- +o Bug #497013: Cached read directory error
-
- The error message for a cached read directory error was
- made more user friendly by including the module/slot
- information.
-
- +o Bug #497729: Disabling CPUs produces alarming message
- at boot
-
- Warning messages during volunteer-for-widget phase of
- xbow io initialization have been masked for headless
- nodes.
-
- +o Bug #500585: Wrong register is being read in router
- error state retrieval
-
- RR_PORT_PARMS and RR_STATUS_ERROR registers were being
- swapped while printing the router error state and this
- has been fixed.
-
- +o Bug #705897: ORIGIN PROGRAM FAILS WITH F77 7.2 USING
- -O3
-
- This was a bug in the floating point emulation code in
- the kernel. If a floating point exception is taken on
- an instruction in a branch delay slot, the kernel must
- emulate the branch in order to compute the proper
- program counter for the faulting program. The emulation
- code for the MIPS4 bc1t/bc1f family of instructions was
- incorrect, thus resulting in an incorrect program
- counter when the user program was restarted after the
- exception.
-
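- To make the branch emulation step concrete, the Python
- sketch below computes the resume address for a fault in
- the delay slot of a bc1t/bc1f instruction, following the
- MIPS IV instruction encoding. It is an illustrative model
- and not the kernel's emulation code; the function names
- are hypothetical. When the faulting instruction is in a
- delay slot, the saved exception PC points at the branch,
- which is the branch_pc argument here.
-
```python
COP1 = 0x11   # major opcode of coprocessor 1 instructions
BC = 0x08     # rs field value for the bc1f/bc1t family


def sign_extend16(value: int) -> int:
    return value - 0x10000 if value & 0x8000 else value


def fp_condition(fcsr: int, cc: int) -> bool:
    # Condition code 0 is FCSR bit 23; codes 1..7 are bits 25..31.
    bit = 23 if cc == 0 else 24 + cc
    return bool((fcsr >> bit) & 1)


def resume_pc(branch_pc: int, instr: int, fcsr: int) -> int:
    """PC to resume at after a fault in the delay slot of bc1t/bc1f."""
    if (instr >> 26) != COP1 or ((instr >> 21) & 0x1F) != BC:
        raise ValueError("not a bc1t/bc1f instruction")
    cc = (instr >> 18) & 0x7                   # condition code selector
    branch_on_true = bool((instr >> 16) & 1)   # tf bit: 1 = bc1t, 0 = bc1f
    taken = fp_condition(fcsr, cc) == branch_on_true
    if taken:
        # The offset is in words, relative to the delay slot address.
        return branch_pc + 4 + (sign_extend16(instr & 0xFFFF) << 2)
    return branch_pc + 8                       # skip branch and delay slot
```
-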
- Bugs fixed in Patch SG0001978:
-
- +o Bug #432166 : panic due to tlbmiss in trilevel_pte()
-
- +o Bug #433662: Processes Can Hang on Isolated/Restricted
- Processor.
-
- When a processor is isolated or restricted, usually as
- part of running a real-time application, other processes
- which are not mustrun onto the isolated/restricted
- processor can be hung. This bug has been particularly
- observed while running Mediabase applications.
-
- +o Bug #458212 : ipcs doesn't report outstanding shared
- memory
-
- +o Bug #462005 : The attr_multi system call produced
- errors if the count of operations was greater than 1.
-
- +o Bug #463762 :
-
- Device interrupt allocation could not be done in spite of
- interrupt bits being available. This fixes a bug in the
- interrupt target selection process on a particular hub
- where only one cpu is enabled. Also the interrupt
- target selection algorithm is made more generic.
-
- +o Bug #464148 :
-
- In order to support extremely large I/O configurations,
- the number of hwgraph vertexes that the kernel can
- handle is now controlled through a static tunable in
- stune/kernel, "hwgraph_num_dev". The default value
- should be sufficient for the vast majority of
- installations.
-
- +o Bug #466601 : sbrk system call should increase region
- size based on page size
-
- This is a performance enhancement. It allows programs
- that do a lot of small mallocs (like C++ programs) to
- use large pages effectively.
-
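- One way to picture this enhancement is a break that
- grows in whole pages, so that later small requests fit in
- the slack. The Python sketch below shows only the rounding
- arithmetic; grow_break is a hypothetical name and the page
- sizes in the usage note are illustrative, not values taken
- from the kernel.
-
```python
def grow_break(region_end: int, new_break: int, page_size: int) -> int:
    """Return the region end after an sbrk-style request.

    The region grows to the next multiple of page_size at or above
    new_break, so small follow-up requests need no further growth.
    """
    if new_break <= region_end:
        return region_end                          # already covered
    return -(-new_break // page_size) * page_size  # round up
```
-
- For example, grow_break(0, 100, 16384) yields 16384, and
- a later request for a break at 200 leaves the region
- unchanged.
-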
- +o Bug #468034 :
-
- This patch allows independent processes to share the
- kernel data structures that describe their address
- space. These data structures are called Page Tables
- and contain information about the virtual to physical
- address translation. A big benefit of sharing Page
- Tables is speed. In fact any new process attaching to
- the SHM segment benefits from the page faulting
- activity performed by other attached processes. This
- dramatically reduces the number of page faults and
- makes a great difference in the overall performance.
-
- This patch is highly recommended for installations
- running large Oracle databases. Processes that want
- to make use of this feature should specify a special
- flag when calling shmat. This option is only available
- if both the attaching address and the size of the SHM
- segment satisfy appropriate restrictions. See shmat(2)
- for detailed information.
-
- +o Bug #468287 :
-
- The kernel routine which allocated user virtual address
- space was very inefficient when there were a large
- number of mappings.
-
- +o Bug #468904 : Weightless processes slow system
- response in multiprocessor machines
-
- Weightless processes compete effectively with normal
- timesharing processes, causing erratic interactive
- behavior. This patch searches more extensively for
- time-sharing threads before running weightless threads.
-
- +o Bug #469295 : Kmem_zone_alloc() should take a policy
- parameter
-
- Zone allocator now accepts a parameter to indicate the
- radius of the search to get the memory for a zone
- request. This is useful to avoid zone size bloat when
- lots of processes are started and killed.
-
- +o Bug #472156 : par can hang system
-
- This moves the fawltysched() call down after the
- kthread is unlocked. Calling fawltysched() while the
- kthread is locked can lead to deadlocks.
-
- +o Bug #473350 : Can't do copy-on-write from read-only
- vnode region
-
- +o Bug #473757 : ipcs does not report outstanding shared
- memory
-
- +o Bug #473776 : Large pages can cause crashes due to
- inconsistent PTEs
-
- The PM policy synch code did not check to make sure
- that the pte bits are consistent for all the base pages
- of the large page. In this case some of the ptes had
- the mod bit set. This caused a large page to be formed
- with some ptes having the mod bit set and some not
- having the bit set.
-
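- The missing consistency check can be illustrated with a
- short Python predicate. This is a sketch only: the bit
- positions and the can_form_large_page name are
- hypothetical and do not reflect the real PTE layout.
-
```python
from typing import Sequence

PTE_VALID = 0x1   # illustrative bit positions, not the real layout
PTE_MOD = 0x2     # "modified" bit


def can_form_large_page(ptes: Sequence[int],
                        mask: int = PTE_VALID | PTE_MOD) -> bool:
    """True if every base-page PTE agrees on the bits selected by mask."""
    if not ptes:
        return False
    first = ptes[0] & mask
    return all((pte & mask) == first for pte in ptes)
```
-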
- +o Bug #474576 : ipcs on 6.4 broken - duplicate of bug
- #473757.
-
- +o Bug #474898 : NLM cancel requests were not always
- properly honored.
-
- +o Bug #475414 : PIO errors during probing should not be
- reported
-
- +o Bug #475765 : DFS support needed to be added to the
- kernel.
-
- +o Bug #475913 : Coalesced performance improvements
-
- +o Bug #476706 : Panic messages need to be logged in the
- flashlog for IP27 systems.
-
- +o Bug #477990 :
-
- Fixed the chunk cache to free up clean memory more
- proactively instead of waiting until freemem gets
- really low.
-
- +o Bug #478654 :
- Power failure data corruption (note: possible real-time
- impact)
-
- Abrupt loss of AC power to an Origin or Onyx2 system
- during I/O operations may cause a small amount of
- corrupt data to be transmitted or committed to disk,
- which can be a fatal problem esp. in database
- applications. This workaround prevents this by
- immediately halting all I/O when the system controller
- (MSC) power failure early warning is detected.
-
- Impact on real-time system performance is possible with
- old MSCs; affected users may eliminate this possibility
- by changing the systune variable ignore_sysctlr_intr to
- 1 or replacing the older MSC.
-
- +o Bug #480640 : sigwait would not work properly with
- pthreads programs.
-
- A pthread that blocked a particular signal then
- attempted to wait for the signal via sigwait(3) or
- sigtimedwait(3) would not be notified of the signal's
- delivery.
-
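- The usage pattern affected by this bug (block a signal,
- then wait for it synchronously in a thread) looks roughly
- like the following. The sketch uses Python's standard
- signal module, which wraps the POSIX pthread_sigmask,
- pthread_kill and sigwait calls on Unix systems; the
- wait_for_signal helper is a hypothetical name, and this is
- not a test case from the bug report.
-
```python
import signal
import threading


def wait_for_signal(sig: int = signal.SIGUSR1, timeout: float = 5.0):
    """Block sig, wait for it in a worker thread, return what arrived."""
    received = []
    # Block the signal first; threads created afterwards inherit the mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        worker = threading.Thread(
            target=lambda: received.append(signal.sigwait({sig})))
        worker.start()
        # Send the signal to the worker thread. Because it is blocked
        # there, it stays pending until sigwait() accepts it.
        signal.pthread_kill(worker.ident, sig)
        worker.join(timeout)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received[0] if received else None
```
-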
- +o Bug #481414 : Reverse maps need to grow in smaller
- steps
-
- The reverse map needed to grow in much smaller steps
- than it was. It was taking up too much memory in large
- memory machines if more than 15 processes share the
- memory. With much smaller steps the memory use came
- down from 1.4G to 295M.
-
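- The effect of the growth step on over-allocation can be
- shown with a toy calculation in Python. The two growth
- policies below are hypothetical; this note does not state
- the actual step sizes used by the reverse map code.
-
```python
from typing import Callable


def capacity_after_growth(entries: int, initial: int,
                          grow: Callable[[int], int]) -> int:
    """Capacity of a table that starts with `initial` slots and calls
    grow(capacity) each time it is too small for `entries` slots."""
    capacity = initial
    while capacity < entries:
        capacity = grow(capacity)
    return capacity


def big_step(capacity: int) -> int:
    return capacity * 4    # hypothetical coarse policy


def small_step(capacity: int) -> int:
    return capacity + 4    # hypothetical fine-grained policy
```
-
- With 16 initial slots and 17 entries, the coarse policy
- ends at 64 slots while the fine-grained one ends at 20.
-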
- +o Bug #483044 : cache error type=interface messages are
- confusing
-
- In the case of "Type=Interface", we should not print
- the word "cache" at all. Instead, the message should
- say "System Interface Error" or "Memory Error".
-
- +o Bug #483048 :
-
- Fixed a bug where error_dump was not getting called on
- certain kinds of "Kernel Data Bus Error" panics.
-
- +o Bug #483683 : Time-slice end not respected on CC-NUMA
- systems.
-
- Memory affinity code on CC-NUMA systems overrides
- time-slice end, allowing processes to run for extended
- periods without rescheduling.
-
- +o Bug #483978 : Origin/Onyx2 vme support
-
- Fix the edtinit path to allow Origin2000 VME devices to be
- probed and their device drivers loaded.
-
- +o Bug #484353 : Made sure global_buf_table points to
- initialized memory to avoid kernel panics while
- recycling a buffer.
-
- +o Bug #484611 : close-on-exec not handled properly for
- sproc processes
-
- A fix is included to properly close file descriptors
- marked as close-on-exec. Previously, they were not
- properly closed for sproc processes that exec'ed.
- Detected at several sites that tried to run Gaussian.
-
- +o Bug #484659 : Race condition in trilevel_pte
-
- There was a race condition in trilevel_pte which caused
- the segtable to be freed twice.
-
- +o Bug #484690 :
-
- Added single-bit ECC error monitoring features. This
- allows the detection of stuck data lines that may
- otherwise go unnoticed because they are transparently
- corrected as single-bit errors.
-
- +o Bug #484698 : debug() should check to make sure kdebug
- is set before trapping
-
- We should not attempt to call the debugger if it isn't
- loaded.
-
- +o Bug #484708 :
-
- Added board serial numbers in hardware error state.
-
- +o Bug #484714 :
-
- Fixed sending of panic interrupts to the rest of the
- cpus from the cpu which is handling an nmi.
-
- +o Bug #485110 : Overlapping memory placement for
- multiple parallel jobs.
-
- Multiple parallel jobs often get placed on nodes which
- are already in use even when there are free nodes
- available. This bug can dramatically decrease
- performance for large throughput runs which include
- multiple parallel jobs.
-
- +o Bug #485318: BTE disabling information should be made
- more user friendly.
-
- Change message that gets printed when a BTE gets
- disabled to go to console buffer. Also indicate it will
- be restarted when system reboots. Make it a notice
- instead of warning.
-
- +o Bug #489412 : shmget fails when size > 2GB using 64bit
- ABI
-
- Correct data types so that the kernel now honors the
- creation of large shared memory areas specified with
- 64bit sizes.
-
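- Why the data type matters can be seen from a small Python
- demonstration using ctypes. It assumes, for illustration
- only, that the size was held in a signed 32-bit field
- somewhere along the path; this note does not say which
- types were involved. The helper names are hypothetical.
-
```python
import ctypes

GIB = 1 << 30


def as_int32(size: int) -> int:
    """Value seen by code that stores the size in a signed 32-bit field."""
    return ctypes.c_int32(size).value


def as_int64(size: int) -> int:
    """Value seen by code that stores the size in a signed 64-bit field."""
    return ctypes.c_int64(size).value
```
-
- A 3 GB request read back through as_int32 comes out
- negative, while as_int64 preserves it.
-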
- +o Bug #490636:
-
- Extract the correct serial number from the nic
- information in case of multiple nic information entries
- being stored for a single node board.
-
- +o Bug #492365: curaspm() macro should return proper
- pointer.
-
- Fix the way aspm pointer was returned to the caller.
-
- +o Bug #704587 : Swap and dump devices could not be
- specified off of the root disk.
-
- Previously, the kernel attempted to open the swap and
- dump devices early in the boot sequence, when only the
- root device was in the hardware graph. With this
- patch, non-default swap and dump devices are set up
- after the hardware graph is fully initialized. Specify
- these devices as full pathnames, for example,
- /dev/dsk/dks0d2s1. NOTE: besides this patch, a
- separate patch to /sbin/ioconfig is required to use
- non-default swap and dump devices.
-
- +o Kernel fixes that enable patch 1992 to fix the ipcs
- command and address problems with SysV shm reporting.
- Note that the fixes in this patch don't actually fix
- the problems (reported in 458212, 473757, 474576).
- This patch satisfies the kernel prerequisites for patch
- #1992, which fixes those problems.
-
- +o Bug #500607: Origin low-level interrupt code fails to
- handle NULL dev_desc
-
- Origin systems now correctly accept a NULL dev_desc
- parameter in calls to *_intr_alloc. The result will be
- a threaded interrupt handler, the same as if the
- default dev_desc for the device had been passed in.
-
- Bugs fixed in Patch SG0002056:
-
- +o Bug #477391 : New ioctl PIOCGETINODE for /proc to get
- inode information about a debugged process' files
-
- Bugs fixed in Patch SG0001856:
-
- +o Added support for new IP29 board
-
- A change in the physical IP29 board required kernel
- support. Boards with part number 030-1244-001 are
- supported by this patch.
-
- +o Bug #473951 : On OCTANE, improve performance when
- checking CPU status.
-
- Use a cached variable to determine whether a cpu is
- enabled or not instead of doing 2 pio reads to heart;
- fix the loop that calculates maxcpus.
-
- +o Bug #472570 : race in early bootup affects small
- machines
-
- +o Bug #472381 :
- Bug in page fault handler
-
- Kernel would panic in vfault when faulting in a demand
- zero fill page due to an invalid attribute structure
- reference. The attribute structure was becoming
- invalid due to a temporary release of the region lock
- while zeroing out a page, in order to increase
- parallelism. This bug had a high probability of
- occurrence when running highly multithreaded
- applications, especially when portions of the shared
- address space were being pinned.
-
- Another manifestation of this bug was an application
- hanging in an unkillable state.
-
- +o Bug #472362 :
-
- Read the corresponding int pend registers after
- clearing the interrupt to avoid a race where the bit
- gets cleared much later causing us to lose interrupts.
-
- +o Bug #472121 :
-
- There is a race in the hardware error saving code that
- causes the FRU to give a bogus analysis if we get a
- cache error while we are saving the error state and
- panicking.
-
- +o Bug #472041 : Added support to turn off bypassing in
- the router on IP27
-
- +o Bug #471664 : Nsort program crashes while using shared
- memory
-
- +o Bug #471654 : Memory errors can go unrecorded due to
- speculation
-
- Multiple uncorrectable errors could cause the md error
- register to be set due to speculation on the local node.
- However software has no indication of this since we
- don't see an interrupt or cache errors. When we do get
- the real error on another page, the error register
- still holds the first error and the multiple error bit
- gets set in the register.
-
- Since the error address does not match the address in
- the register, the page does not get discarded. This
- allows the page to get reused and we finally panic but
- since the bad address is not logged anywhere, we cannot
- report the error correctly.
-
- +o Bug #471021 : Correction to Fetch+Op cache flushing
-
- Fetch+Op cache needs to be flushed when the page that's
- allocated for Fetch+Op operation is being freed.
- Reusing this page without flushing could lead to
- problems.
-
- +o Bug #470333 : Improvement of memory error messages on
- IP30.
-
- +o Fix checking of unique id (uuid)
-
- One case of unique id (uuid) comparison in the kernel
- was incorrect; also the error codes returned for
- different flavors of invalid uuids were not in
- compliance with the DCE specification.
-
- +o Bug #467176 : On IP27 systems, the kernel may panic
- with CrayLink network timeout messages.
-
- For IP27 systems, the aging of messages which
- facilitates message delivery without starvation was not
- set up correctly. This could cause the machine to panic
- since some messages time out after being starved for a
- long time. This bug especially affects configurations
- with a large number of cpus. This bug has been fixed in
- this patch.
-
- +o Bug #465295 : Improper calculation of starting virtual
- address
-
- Kernel fault when running a third-party data-mining
- application.
-
- +o Bug #466237 : Fix to system call bug that may cause a
- system panic in syssgi(2) using SGI_RT_TSTAMP_UPDATE
-
- +o Bug #465025 : Fix to system call bug that may cause a
- system panic in setcontext(2).
-
- +o Bug #465061 : Fix to system call bug that may cause a
- system panic in syssgi(2) using SGI_SPROFIL as the
- request.
-
- +o Bug #464708 : Fix so diskless clients can boot
-
- +o Bug #464517 : Bug in kernel's emulate_branch code.
-
- This scenario can happen whenever there is a floating
- point instruction in the shadow of one of these
- branches. Found because the exponential function in
- libfastm was sometimes failing.
-
- Bugs fixed in Patch SG0001954:
-
- +o Bug #470142 : panic due to null p->p_shaddr in
- irix5_prgetpsinfo()
-
- Bugs fixed in Patch SG0001815:
- +o Bug #463622 : Device drivers trying to map kernel
- memory to user address space could panic the system
-
- The kernel would panic in the spec_unmap() routine when a
- user-level process invoked the mmap(2) system call on
- such a device driver. The driver asked the kernel to
- allocate memory, and the kernel returned an address in
- kernel virtual memory space (XKSEG). The driver then
- tried to map this address into user address space. The
- interface that performs this mapping checked the kernel
- virtual address range incorrectly and returned an error
- for the mapping. The error return path of the mmap(2)
- system call then caused the panic.
-
- This bug would be triggered only if device drivers try
- to allocate kernel memory greater than a single page
- size (16Kbytes).
-
- +o Bug #460221 : Systems with just one running processor
- would cause hangs.
-
- This bug would get triggered only on systems with one
- processor. In these systems, the utlbmiss code path for
- single cpu Origin 2000 and Origin 200 was broken since
- the functions that selected and removed the
- (switchable) utlbmiss handlers were not consistent.
-
- That is, for single cpu Origins, utlbmiss_resume
- always patches the utlbmiss code in one way, whereas
- utlbmiss_reset does not undo the patch correctly.
- Always using the mp case for Origins fixes the problem.
-
- +o Bug #463665 : Isolating processors on Origin systems
- causes kernel panic.
-
- When a processor is isolated, code in locore attempts
- to update the p_kvfault array which contains a bit for
- each kernel virtual address and indicates that the
- processor faulted on that XKSEG address. This is
- needed since isolated processors do NOT have their tlbs
- synced with the other processors unless they have
- faulted on the addresses being freed.
-
- When the kernel became mapped, the first 32 MB of XKSEG
- space was removed from the sptmap but was left in the
- kptbl. This requires us to bias the address used in
- accessing the p_kvfault array.
-
- +o Bug #484706 : System Interface Error Reported as Cache
- Error.
-
- System interface errors in the R10000 are reported
- through the cache error register, and were thus
- reported by the kernel as cache errors (which is
- misleading). System interface error messages are now
- printed in these cases.
-
- +o Bug #483230 : Optimized the code that handles shaddr
- sproc processes that are being debugged and have locked
- instruction pages. Previously, each sproc being debugged
- got its own copy of all the locked instruction pages,
- leading to bloat. Page copying has been minimized so
- that only the page being modified by the debugger (e.g.,
- to set a breakpoint) is made private to the target
- sproc.
-
- +o Bug #496469 : hubdev_callouts can hang a machine.
-
- hubdev_callouts holds a spinlock and calls functions
- which could in some rare cases go to sleep. This was
- observed mainly on systems where both cpus on a node
- board had been disabled. The fix has been to change the
- spinlock to a mutex.
-
- +o Bug #484928 : N32 program causes machine not to make
- progress.
-
- There are cases where a cpu appears not to make
- progress while running a program. There now exists a
- periodic check from each cpu whether forward progress
- is being made by all the other cpus. Appropriate action
- is taken if any cpu does not seem to be making
- progress.
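-
- One way such a periodic check could work is sketched below in
- C. This is not the kernel's implementation: the per-CPU
- heartbeat counters, the names cpu_tick and check_others, and the
- value of NCPU are all hypothetical, and the sketch is
- single-threaded (a real kernel would need atomic or otherwise
- synchronized access to the counters).
-
```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4  /* hypothetical CPU count for this sketch */

struct progress {
    uint64_t beat[NCPU];        /* bumped by each CPU as it does work */
    uint64_t seen[NCPU][NCPU];  /* seen[c][o]: last beat of o observed by c */
};

/* Called by a CPU whenever it makes forward progress. */
static void cpu_tick(struct progress *p, int cpu)
{
    p->beat[cpu]++;
}

/*
 * Periodic check run on CPU `self`. Returns a bitmask of the other CPUs
 * whose counter has not advanced since self's previous check.
 */
static unsigned check_others(struct progress *p, int self)
{
    unsigned stalled = 0;

    assert(self >= 0 && self < NCPU);
    for (int other = 0; other < NCPU; other++) {
        if (other == self)
            continue;
        if (p->beat[other] == p->seen[self][other])
            stalled |= 1u << other;
        p->seen[self][other] = p->beat[other];
    }
    return stalled;
}
```
-
- The caller decides what action to take for each CPU reported
- in the returned mask.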
-
- +o Bug #504612 : /proc psinfo gives wrong sched class.
-
- Scheduling classes were reported incorrectly because of
- an incorrect check of PR_SPID. The check has been
- corrected.
-
- +o Bug #506980 : N on N performance degradation.
-
- In these cases, parallel jobs could be placed such that
- memories were reused, causing extreme slowdown when used
- in conjunction with mustrun.
-
- 1.4 Subsystems Included in Patch SG0002211
-
- This patch release includes these subsystems:
-
- +o patchSG0002211.dev_man.irix_lib
-
- +o patchSG0002211.eoe_hdr.lib
-
- +o patchSG0002211.eoe_man.unix
-
- +o patchSG0002211.eoe_sw.kdebug
-
- +o patchSG0002211.eoe_sw.unix
-
-
- 1.5 Installation Instructions
-
- Because you want to install only the patches for problems
- you have encountered, patch software is not installed by
- default. After reading the descriptions of the bugs fixed
- in this patch (see Section 1.3), determine the patches that
- meet your specific needs.
-
- If, after reading Sections 1.1 and 1.2 of these release
- notes, you are unsure whether your hardware and software
- meet the requirements for installing a particular patch, run
- inst. The inst program does not allow you to install
- patches that are incompatible with your hardware or
- software.
-
- Patch software is installed like any other Silicon Graphics
- software product. Follow the instructions in your Software
- Installation Administrator's Guide to bring up the miniroot
- form of the software installation tools.
-
- Follow these steps to select a patch for installation:
-
- 1. At the Inst> prompt, type
-
- install patchSGxxxxxxx
-
- where xxxxxxx is the patch number.
-
- 2. Initiate the installation sequence. Type
-
- Inst> go
-
- 3. You may find that two patches have been marked as
- incompatible. (The installation tools reject an
- installation request if an incompatibility is
- detected.) If this occurs, you must deselect one of
- the patches.
-
- Inst> keep patchSGxxxxxxx
-
- where xxxxxxx is the patch number.
-
- 4. After completing the installation process, exit the
- inst program by typing
-
- Inst> quit
-
-
-
- 1.6 Patch Removal Instructions
-
- To remove a patch, use the versions remove command as you
- would for any other software subsystem. The removal process
- reinstates the original version of software unless you have
- specifically removed the patch history from your system.
-
- versions remove patchSGxxxxxxx
-
- where xxxxxxx is the patch number.
-
- To keep a patch but free disk space, use the
- versions removehist command to remove the patch history.
-
- versions removehist patchSGxxxxxxx
-
- where xxxxxxx is the patch number.
-
- 1.7 Known Problems
-